Chapter 01 — Overview

LangChain & LangGraph from Scratch

A complete technical guide — from zero to production-grade AI agents.

What are we learning?

LangChain is the most popular framework for building LLM-powered applications. It provides composable building blocks: prompts, models, chains, memory, tools, and agents.

LangGraph is LangChain's newer extension that models agent logic as a stateful graph — giving you fine-grained control over multi-step reasoning, loops, and branching workflows.

🔗 LangChain

Composable primitives for LLM apps. Best for: RAG pipelines, chatbots, single-agent tasks.

🕸️ LangGraph

Stateful graph execution engine. Best for: multi-step agents, loops, human-in-the-loop workflows.

⚡ When to use which

Use LangChain for linear pipelines. Use LangGraph when you need branching, retries, or persistent state.

🏗️ They work together

LangGraph nodes typically use LangChain components: LLMs, tools, prompts, and memory.

The Stack

Your Application
  └─ LangGraph  ← orchestrates agent flow (nodes, edges, state)
      └─ LangChain  ← provides LLMs, prompts, tools, memory, retrievers
          └─ LLM APIs  ← OpenAI / Anthropic / Gemini / local models

Installation

bash
# Install core packages
pip install langchain langchain-openai langchain-community
pip install langgraph
pip install python-dotenv  # for API key management
💡 Tip You'll need an OpenAI API key (or any supported LLM). Set it as OPENAI_API_KEY in a .env file or environment variable.
Chapter 02 — LangChain Basics

LLMs & Prompt Templates

The two most fundamental building blocks in any LangChain application.

1. Calling an LLM

LangChain wraps LLM providers behind a unified interface. You can swap ChatOpenAI for ChatAnthropic, ChatGoogleGenerativeAI, etc. without changing your app logic.

python
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

# Initialize the model
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)

# Simple invocation
response = llm.invoke([HumanMessage(content="What is LangChain?")])
print(response.content)  # AIMessage.content is the text

2. Prompt Templates

Hard-coding prompts is brittle. Prompt Templates let you define reusable prompt structures with variables.

python
from langchain_core.prompts import ChatPromptTemplate

# Define a template with {topic} variable
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that explains {domain} concepts."),
    ("human", "Explain {topic} in simple terms.")
])

# Format the prompt with actual values
formatted = prompt.invoke({
    "domain": "machine learning",
    "topic": "gradient descent"
})

response = llm.invoke(formatted)
print(response.content)

3. The LCEL Pipe Operator

LangChain Expression Language (LCEL) lets you chain components with the | operator — creating clean, readable pipelines.

python
from langchain_core.output_parsers import StrOutputParser

# Chain: prompt → llm → parse to string
chain = prompt | llm | StrOutputParser()

# Invoke in one line!
result = chain.invoke({"domain": "AI", "topic": "embeddings"})
print(result)  # plain string output

# Streaming is built-in
for chunk in chain.stream({"domain": "AI", "topic": "embeddings"}):
    print(chunk, end="", flush=True)
🔑 Key Insight LCEL chains are lazy — they don't execute until you call .invoke(), .stream(), or .batch(). Every component in a chain must have matching input/output types.
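The Runnable contract is easy to internalize with a toy version. Here is a pure-Python sketch — not LangChain's real classes — of why building a chain is lazy and how invoke, batch, and stream relate to each other:

```python
# A minimal sketch of the Runnable contract — illustration only,
# not LangChain's actual LCEL classes.

class MiniRunnable:
    """Wraps a function; composes with | like an LCEL component."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # Composing is lazy: nothing runs until invoke/batch/stream is called
        return MiniRunnable(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

    def batch(self, xs):
        return [self.invoke(x) for x in xs]

    def stream(self, x):
        # Yield the result word by word to mimic token streaming
        for word in self.invoke(x).split():
            yield word + " "

prompt = MiniRunnable(lambda d: f"Explain {d['topic']} in {d['domain']}")
fake_llm = MiniRunnable(lambda p: p.upper())   # stand-in for a real model call
chain = prompt | fake_llm

print(chain.invoke({"domain": "AI", "topic": "embeddings"}))
# → "EXPLAIN EMBEDDINGS IN AI"
```

The real Runnable interface carries the same shape: batch maps invoke over a list, and stream yields chunks instead of returning one value.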
Chapter 03 — LangChain

Chains & RAG Pipelines

Compose complex workflows by chaining LangChain components together.

What is a Chain?

A chain is a sequence of components where the output of one becomes the input of the next. The most powerful use case is Retrieval-Augmented Generation (RAG) — grounding LLM responses in your own documents.

RAG Pipeline:

User Question
  ↓ Embeddings    ← convert question to a vector
  ↓ Vector Store  ← find similar document chunks
  ↓ Prompt        ← inject retrieved context
  ↓ LLM           ← generate grounded answer
  ↓ Answer

Building a RAG Chain

python
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# 1. Create a vector store from your documents
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_texts(
    texts=[
        "LangChain is a framework for building LLM apps.",
        "LangGraph adds stateful graph execution to LangChain.",
        "FAISS is a library for efficient similarity search.",
    ],
    embedding=embeddings
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

# 2. RAG prompt template
rag_prompt = ChatPromptTemplate.from_template("""
Answer based ONLY on the context provided.

Context: {context}

Question: {question}
""")

# 3. Build the chain
llm = ChatOpenAI(model="gpt-4o-mini")

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)

# 4. Ask a question!
answer = rag_chain.invoke("What is LangGraph?")
print(answer)

Output Parsers

Output parsers transform raw LLM text into structured data your code can use.

python
from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel

class MovieReview(BaseModel):
    title: str
    rating: int
    summary: str

parser = JsonOutputParser(pydantic_object=MovieReview)

prompt = ChatPromptTemplate.from_template(
    "Review the movie '{title}'.\n{format_instructions}",
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

chain = prompt | llm | parser
review = chain.invoke({"title": "Inception"})
print(review["rating"])  # JsonOutputParser returns a dict — use key access
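Conceptually, a JSON output parser mostly strips any markdown fence the model wrapped its answer in and then calls json.loads. A stdlib-only sketch of the idea — not LangChain's actual implementation:

```python
# What a JSON output parser does, conceptually — a stdlib-only sketch,
# not LangChain's real JsonOutputParser.
import json

def parse_json_output(llm_text: str) -> dict:
    """Strip a markdown code fence if present, then parse the JSON body."""
    text = llm_text.strip()
    if text.startswith("```"):
        # Drop the fences; keep the body between them
        text = text.split("```")[1]
        if text.startswith("json"):       # optional "```json" language tag
            text = text[len("json"):]
    return json.loads(text)

raw = '```json\n{"title": "Inception", "rating": 9, "summary": "A heist inside dreams."}\n```'
review = parse_json_output(raw)
print(review["rating"])  # → 9
```

The real parser is more robust (partial JSON while streaming, better error messages), but the core transformation is this.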
Chapter 04 — LangChain

Memory & Chat History

LLMs are stateless — memory is how you give them context across turns.

The Problem

Every LLM call is independent. If you ask "What did I just say?", the model has no idea — unless you pass the conversation history explicitly.

Approach 1: In-Memory (Simple Chatbot)

python
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.prompts import MessagesPlaceholder

# Store for chat histories (keyed by session_id)
store = {}

def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]

# Prompt that includes message history placeholder
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}"),
])

chain = prompt | llm | StrOutputParser()

# Wrap with memory management
with_memory = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)

# Conversation turn 1
r1 = with_memory.invoke(
    {"input": "My name is Amir."},
    config={"configurable": {"session_id": "user_1"}}
)
# Conversation turn 2
r2 = with_memory.invoke(
    {"input": "What's my name?"},
    config={"configurable": {"session_id": "user_1"}}
)
print(r2)  # → "Your name is Amir."

Memory Types Comparison

Type              | How it works                     | Best for
InMemory          | Full history in RAM              | Development, short sessions
RedisChatHistory  | Persisted in Redis               | Production multi-user apps
SQLChatHistory    | Stored in SQL DB                 | Auditable chat logs
Summarization     | Compress old turns with LLM      | Very long conversations
Vector memory     | Embed + retrieve relevant turns  | Long-term semantic recall
⚠️ Watch out Full in-memory history grows unbounded. For production, always implement trimming or summarization to stay within the LLM's context window.
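A sliding-window trimmer is the simplest fix. Here is an illustrative stdlib-only sketch with messages as plain role/content dicts (langchain_core also ships a trim_messages helper for real message objects):

```python
# A minimal sliding-window trimmer — an illustrative sketch, not
# LangChain's built-in trimming utilities.

def trim_history(messages: list, max_turns: int = 10) -> list:
    """Keep the system message (if any) plus the last `max_turns` messages."""
    if messages and messages[0].get("role") == "system":
        return [messages[0]] + messages[1:][-max_turns:]
    return messages[-max_turns:]

history = [{"role": "system", "content": "You are helpful."}]
history += [{"role": "user", "content": f"turn {i}"} for i in range(50)]

trimmed = trim_history(history, max_turns=10)
print(len(trimmed))  # → 11 (system prompt + last 10 messages)
```

Pinning the system message while trimming the middle is the usual pattern; summarization replaces the dropped turns with an LLM-written digest instead of discarding them.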
Chapter 05 — LangChain

Tools & Agents

Give your LLM the ability to take actions — search the web, run code, call APIs.

What are Tools?

A Tool is a function the LLM can choose to call. You define what it does; the LLM decides when to call it and with what arguments. This is the core of agentic behavior.

python
from langchain_core.tools import tool

# Define custom tools with the @tool decorator
@tool
def get_weather(city: str) -> str:
    """Get current weather for a city. Use this when asked about weather."""
    # In production: call a weather API
    return f"Weather in {city}: 22°C, partly cloudy"

@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression like '2 + 2' or 'sqrt(16)'."""
    import math
    # Restrict eval to math functions only — never expose bare eval to LLM input
    allowed = {k: v for k, v in vars(math).items() if not k.startswith("_")}
    try:
        return str(eval(expression, {"__builtins__": {}}, allowed))
    except Exception:
        return "Error: invalid expression"

# Bind tools to the LLM
tools = [get_weather, calculate]
llm_with_tools = llm.bind_tools(tools)

Creating a ReAct Agent

A ReAct agent follows the Reason → Act → Observe loop: think about what to do, call a tool, observe the result, repeat until done.

python
from langgraph.prebuilt import create_react_agent

# LangGraph provides a ready-made ReAct agent
agent = create_react_agent(
    model=llm,
    tools=tools,
    state_modifier="You are a helpful assistant. Use tools when needed."
)

# Run it
result = agent.invoke({
    "messages": [{"role": "user", "content": "What's the weather in Montreal and what's 15 * 7?"}]
})

# The agent will call BOTH tools before responding
print(result["messages"][-1].content)
🔑 The ReAct Loop The agent keeps calling tools and observing results in a loop until it decides it has enough information to give a final answer — or a max_iterations limit is hit.

🧠 Quick Check

What makes an LLM "agentic"?

A. Using a larger model
B. The ability to choose and call tools in a loop
C. Having more memory
D. Using streaming responses
Chapter 06 — LangGraph

Why LangGraph?

LangChain's agent loop is powerful but limited. LangGraph gives you full control.

The Limitations of Simple Agents

The basic ReAct agent works for simple tasks. But real production agents need:

🔀 Branching Logic

Route to different workflows based on intent, confidence, or intermediate results.

🔁 Loops & Retries

Retry failed tool calls, iterate on drafts, or run reflection loops.

👤 Human-in-the-Loop

Pause execution and wait for human approval before critical actions.

💾 Persistent State

Checkpoint and resume long-running workflows across sessions.

LangGraph's Mental Model

LangGraph models your agent as a directed graph:

Nodes = functions that transform state
Edges = connections between nodes
State = a typed dict passed through all nodes

START
  ↓
[classify_intent]   ← node: run LLM to classify
  ↓
[search_web] / [answer_from_kb]   ← conditional edge
  ↓
[format_response]
  ↓
END

LangGraph vs Simple Agent

Feature        | Simple Agent         | LangGraph
Branching      | ❌ Linear only       | ✅ Full conditional routing
Loops          | ⚠️ Black box         | ✅ Explicit, controllable
State          | ❌ Message list only | ✅ Typed, custom state
Checkpointing  | ❌ No                | ✅ SQLite, Redis, etc.
Debugging      | ❌ Hard              | ✅ Step-by-step traces
Human-in-loop  | ❌ No                | ✅ Built-in interrupt
Chapter 07 — LangGraph

Graphs, Nodes & Edges

The three primitives that make up every LangGraph application.

Your First Graph

python
from langgraph.graph import StateGraph, START, END
from typing import TypedDict

# 1. Define the State schema — what data flows through the graph
class GraphState(TypedDict):
    messages: list
    current_step: str

# 2. Define nodes — each is just a function
def step_one(state: GraphState) -> GraphState:
    print("Running step one...")
    return {"current_step": "one"}

def step_two(state: GraphState) -> GraphState:
    print("Running step two...")
    return {"current_step": "two"}

# 3. Build the graph
builder = StateGraph(GraphState)

builder.add_node("step_one", step_one)
builder.add_node("step_two", step_two)

# 4. Connect nodes with edges
builder.add_edge(START, "step_one")   # entry point
builder.add_edge("step_one", "step_two")
builder.add_edge("step_two", END)      # exit point

# 5. Compile into a runnable
graph = builder.compile()

# 6. Run it!
result = graph.invoke({"messages": [], "current_step": ""})
print(result)

LLM Node Pattern

In practice, most nodes call an LLM with the current state:

python
from langchain_openai import ChatOpenAI
from langchain_core.messages import AIMessage

llm = ChatOpenAI(model="gpt-4o-mini")

def llm_node(state: GraphState) -> GraphState:
    # Read from state
    messages = state["messages"]
    
    # Call the LLM
    response = llm.invoke(messages)
    
    # Return state update — only changed fields needed
    return {"messages": messages + [response]}

# Node updates are MERGED into state, not replaced
# (unless you define a custom reducer)
✅ Best Practice Nodes should return only the state fields they modify. LangGraph merges the returned dict into the existing state — you don't need to return the whole state object.
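The merge behavior is easy to picture as a shallow dict update. A toy sketch of the default (no-reducer) case — LangGraph's real update logic also applies per-field reducers:

```python
# How LangGraph's default state update works, in spirit: a shallow merge
# of the node's partial return into the current state. Sketch only.

def apply_update(state: dict, node_return: dict) -> dict:
    """Merge a node's partial return into state; untouched keys survive."""
    new_state = dict(state)
    new_state.update(node_return)
    return new_state

state = {"messages": ["hi"], "current_step": ""}
# The node returned only the field it changed:
state = apply_update(state, {"current_step": "one"})

print(state)  # → {'messages': ['hi'], 'current_step': 'one'}
```

Note that "messages" survived untouched even though the node never returned it — that is why partial returns are safe.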
Chapter 08 — LangGraph

State Management

State is the shared memory that flows through your entire graph.

Defining State with Reducers

By default, node updates overwrite state fields. For lists (like message history), you usually want to append instead. Use Annotated reducers for this.

python
from typing import Annotated
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    # add_messages reducer: appends new messages instead of overwriting
    messages: Annotated[list, add_messages]
    
    # These will overwrite on each update (default behavior)
    user_intent: str
    retrieval_docs: list
    is_done: bool

# Now nodes just return the new messages, not the full list:
def my_node(state: AgentState):
    new_msg = llm.invoke(state["messages"])
    return {"messages": [new_msg]}  # will be APPENDED

Checkpointing (Persistent State)

Add a checkpointer to persist graph state across sessions — enabling pause/resume and time-travel debugging.

python
from langgraph.checkpoint.memory import MemorySaver
# For production: from langgraph.checkpoint.sqlite import SqliteSaver

checkpointer = MemorySaver()

graph = builder.compile(checkpointer=checkpointer)

# Each run needs a thread_id to save/restore state
config = {"configurable": {"thread_id": "user-session-42"}}

# First run
graph.invoke({"messages": [{"role":"user", "content":"Hello"}]}, config)

# Second run — state is automatically loaded from checkpoint!
graph.invoke({"messages": [{"role":"user", "content":"What did I say before?"}]}, config)

# View current state
state = graph.get_state(config)
print(state.values["messages"])  # full history!
🔑 Thread IDs Each unique thread_id is an isolated conversation/session. Use per-user or per-session IDs in production. The checkpointer stores and loads state automatically on each invoke.
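Conceptually, a checkpointer is a store of state snapshots keyed by thread_id. A toy sketch of that idea — the real MemorySaver does much more (per-step checkpoints, time travel, pending writes):

```python
# The essence of a checkpointer: state snapshots keyed by thread_id.
# Sketch only — not LangGraph's MemorySaver.

class ToyCheckpointer:
    def __init__(self):
        self._store = {}   # thread_id → latest state snapshot

    def save(self, thread_id: str, state: dict):
        self._store[thread_id] = dict(state)

    def load(self, thread_id: str) -> dict:
        return dict(self._store.get(thread_id, {"messages": []}))

cp = ToyCheckpointer()

# Turn 1: load (empty), update, save
state = cp.load("user-session-42")
state["messages"].append("Hello")
cp.save("user-session-42", state)

# Turn 2: a fresh run on the same thread_id sees the earlier message
print(cp.load("user-session-42")["messages"])  # → ['Hello']

# A different thread_id is fully isolated
print(cp.load("other-thread")["messages"])     # → []
```

Swapping the dict for SQLite or Redis is exactly the difference between MemorySaver and the production checkpointers.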
Chapter 09 — LangGraph

Conditional Edges & Routing

The most powerful LangGraph feature — routing decisions made by your LLM.

Static vs Conditional Edges

Static edges always go from node A to node B. Conditional edges choose which node to go to next based on the current state.

python
from langchain_core.messages import AIMessage

# A router function — reads state, returns the name of the next node
def should_continue(state: AgentState) -> str:
    last_msg = state["messages"][-1]
    
    # If the LLM called tools, go to tools node
    if hasattr(last_msg, "tool_calls") and last_msg.tool_calls:
        return "use_tools"
    # Otherwise, we're done
    return "end"

# Connect using add_conditional_edges
builder.add_conditional_edges(
    "llm_node",          # source node
    should_continue,      # routing function
    {
        "use_tools": "tool_node",  # route name → node name
        "end": END
    }
)

Full ReAct Loop with LangGraph

python
from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import ToolNode
from typing import Annotated, TypedDict
from langgraph.graph.message import add_messages

class State(TypedDict):
    messages: Annotated[list, add_messages]

# LLM with tools bound
llm_with_tools = llm.bind_tools([get_weather, calculate])

def call_llm(state: State):
    return {"messages": [llm_with_tools.invoke(state["messages"])]}

def route_after_llm(state: State) -> str:
    if state["messages"][-1].tool_calls:
        return "tools"
    return "end"

builder = StateGraph(State)
builder.add_node("llm", call_llm)
builder.add_node("tools", ToolNode([get_weather, calculate]))  # auto-executes tool calls

builder.add_edge(START, "llm")
builder.add_conditional_edges("llm", route_after_llm, {"tools": "tools", "end": END})
builder.add_edge("tools", "llm")  # ← loop back after tool execution!

graph = builder.compile()

# The graph will loop: llm → tools → llm → tools → llm → END
💡 The Loop The tools → llm edge creates the ReAct loop. After executing a tool, we go back to the LLM which reads the tool result and either calls another tool or gives a final answer.
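The control flow is easier to see with stubs in place of the model and tools. A pure-Python simulation of the loop (fake_llm and run_tools are hypothetical stand-ins, not LangGraph APIs):

```python
# The llm → tools → llm loop, simulated with stubs so the control flow
# is visible without a real model. Sketch only.

def fake_llm(messages: list) -> dict:
    """Pretend model: asks for the weather tool once, then answers."""
    if not any(m.get("role") == "tool" for m in messages):
        return {"role": "ai",
                "tool_calls": [{"name": "get_weather", "args": {"city": "Montreal"}}]}
    return {"role": "ai", "tool_calls": [], "content": "It's 22°C in Montreal."}

def run_tools(tool_calls: list) -> list:
    return [{"role": "tool", "content": f"Weather in {c['args']['city']}: 22°C"}
            for c in tool_calls]

# The graph loop: llm → (tools → llm)* → END
messages = [{"role": "user", "content": "Weather in Montreal?"}]
while True:
    reply = fake_llm(messages)
    messages.append(reply)
    if not reply["tool_calls"]:                        # router: no tool calls → END
        break
    messages.extend(run_tools(reply["tool_calls"]))    # tools node, then loop back

print(messages[-1]["content"])  # → "It's 22°C in Montreal."
```

The while-loop body is exactly what the conditional edge plus the tools → llm edge encode declaratively in the graph.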
Chapter 10 — LangGraph

Building a Full Production Agent

Put it all together: a multi-tool, stateful agent with memory and error handling.

Complete Agent Implementation

python
from langgraph.graph import StateGraph, START, END  # Core graph builder + entry/exit sentinels
from langgraph.prebuilt import ToolNode              # Pre-built node that automatically executes tool calls
from langgraph.checkpoint.memory import MemorySaver  # In-memory checkpointer to persist conversation state across turns
from langchain_openai import ChatOpenAI              # OpenAI LLM wrapper (GPT-4o, GPT-4, etc.)
from langchain_core.tools import tool               # Decorator to turn any Python function into a LangChain tool
from langchain_core.messages import SystemMessage   # Represents the system prompt message type
from typing import Annotated, TypedDict             # Annotated: attach metadata to types | TypedDict: typed dict schema for State
from langgraph.graph.message import add_messages     # Reducer: appends new messages instead of overwriting the list

# ── 1. TOOLS ─────────────────────────────────
@tool
def search_docs(query: str) -> str:
    """Search internal knowledge base for information."""
    # Replace with your actual retriever
    return f"Found relevant docs about: {query}"

@tool
def web_search(query: str) -> str:
    """Search the web for current information."""
    # Replace with actual Tavily/SerpAPI call
    return f"Web results for: {query}"

tools = [search_docs, web_search]

# ── 2. STATE ─────────────────────────────────
class AgentState(TypedDict):
    messages: Annotated[list, add_messages]
    error_count: int   # track errors for retry logic

# ── 3. NODES ─────────────────────────────────
llm = ChatOpenAI(model="gpt-4o").bind_tools(tools)
SYSTEM_PROMPT = "You are a helpful AI assistant with access to search tools."

def agent_node(state: AgentState) -> AgentState:
    """Main reasoning node."""
    messages = [SystemMessage(content=SYSTEM_PROMPT)] + state["messages"]
    try:
        response = llm.invoke(messages)
        return {"messages": [response], "error_count": 0}
    except Exception:
        return {"error_count": state["error_count"] + 1}

tool_node = ToolNode(tools)

# ── 4. ROUTING ───────────────────────────────
def router(state: AgentState) -> str:
    if state["error_count"] >= 3:
        return "end"  # give up after 3 errors
    last = state["messages"][-1]
    if hasattr(last, "tool_calls") and last.tool_calls:
        return "tools"
    return "end"

# ── 5. GRAPH ─────────────────────────────────
builder = StateGraph(AgentState)
builder.add_node("agent", agent_node)
builder.add_node("tools", tool_node)

builder.add_edge(START, "agent")
builder.add_conditional_edges("agent", router, {"tools": "tools", "end": END})
builder.add_edge("tools", "agent")

graph = builder.compile(checkpointer=MemorySaver())

# ── 6. RUN ───────────────────────────────────
def chat(user_input: str, session_id: str = "default"):
    config = {"configurable": {"thread_id": session_id}}
    result = graph.invoke(
        {"messages": [{"role": "user", "content": user_input}],
         "error_count": 0},
        config
    )
    return result["messages"][-1].content

# Usage
print(chat("Search for LangGraph documentation", "amir-session-1"))
print(chat("What did I just ask about?", "amir-session-1"))  # memory works!

What's Next?

🔀 Multi-Agent Systems

Use send() to spawn parallel subgraphs. Build supervisor agents that coordinate specialized workers.

⏸️ Human-in-the-Loop

Use interrupt_before to pause before sensitive actions. Resume after human review with graph.invoke(None, config).

📊 LangSmith Tracing

Set LANGCHAIN_TRACING_V2=true to get full step-by-step traces, latency, and cost visibility.

🚀 LangGraph Platform

Deploy graphs as APIs with built-in persistence, streaming, and a visual debugger via LangGraph Studio.

🎓 You've Completed the Course! You now know: LangChain primitives (LLMs, prompts, chains, memory, tools) and LangGraph architecture (state, nodes, edges, conditional routing, checkpointing). You have everything to build production AI agents.
Chapter 11 — Q&A Deep Dive

What does invoke() do?

The standard execution method for every Runnable in LangChain's LCEL interface.

Core Idea

invoke is the primary way to execute anything in LangChain. Every object that implements the Runnable interface — LLMs, prompt templates, chains, retrievers, agents — exposes .invoke().

It takes an input and returns an output by running it through the component.

python
# On an LLM
response = llm.invoke("What is RAG?")

# On a prompt template
prompt = ChatPromptTemplate.from_template("Tell me about {topic}")
result = prompt.invoke({"topic": "LangChain"})

# On a full chain
chain = prompt | llm | output_parser
result = chain.invoke({"topic": "LangChain"})

The Runnable Interface

invoke is part of a consistent interface that all LCEL components share — meaning you can swap any component freely and it always responds the same way.

Method   | Purpose
invoke   | Run once, return single output
batch    | Run on a list of inputs
stream   | Stream output tokens as they arrive
ainvoke  | Async version of invoke

invoke in a RAG Pipeline

python
retriever = vectorstore.as_retriever()
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# invoke kicks off the whole pipeline
answer = chain.invoke("What does this document say about X?")

When you call invoke on the chain, it:

1. Passes the input through the retriever to fetch relevant docs
2. Injects them into the prompt template
3. Sends the formatted prompt to the LLM
4. Parses and returns the output
🔑 Key Point Think of invoke as the "run this now, give me the result" method. It replaced the older .run() and .__call__() patterns in LangChain, unifying the interface across all component types under LCEL.
Chapter 12 — Q&A Deep Dive

How does the | pipe chain work?

LCEL's pipe operator — the cleanest way to compose LangChain components.

Yes, that's how chains are defined

The | (pipe) syntax is LCEL — LangChain Expression Language. The operator chains components together where the output of one becomes the input of the next.

python
chain = prompt | llm | StrOutputParser()

Reads as: "take a prompt → feed it to the LLM → parse the output as a string"

Step by Step

python
# 1. Prompt template - formats your input into a message
prompt = ChatPromptTemplate.from_template("Tell me about {topic}")

# 2. LLM - takes the formatted prompt, returns an AIMessage
llm = ChatOpenAI(model="gpt-4")

# 3. Parser - extracts just the string text from AIMessage
parser = StrOutputParser()

# 4. Pipe them together into one Runnable
chain = prompt | llm | parser

# 5. Invoke the whole pipeline
chain.invoke({"topic": "RAG systems"})

What flows through

{"topic": "RAG systems"}
  ↓ prompt
"Tell me about RAG systems"       (formatted ChatPromptValue)
  ↓ llm
AIMessage(content="RAG is...")    (raw LLM response object)
  ↓ StrOutputParser
"RAG is..."                       (plain string)

Why the | works

Under the hood, | calls __or__ which wraps everything in a RunnableSequence. So these two are equivalent:

python
# Pipe syntax (clean)
chain = prompt | llm | parser

# Equivalent verbose form
chain = RunnableSequence(first=prompt, middle=[llm], last=parser)

More Complex: RAG fan-in

python
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

Here a dict is used to fan-in multiple sources (retrieved docs + the original question) before hitting the prompt. RunnablePassthrough() just passes the input unchanged.

💡 In short | is syntactic sugar for building a pipeline of Runnables, which you then execute with .invoke(). Every step must have compatible input/output types.
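A miniature version of the same trick shows how little machinery the pipe needs. This is a sketch of the idea, not LangChain's RunnableSequence:

```python
# A toy pipe operator: __or__ collects steps into one flattened sequence.
# Sketch only — not LangChain's RunnableSequence.

class Step:
    def __init__(self, *fns):
        self.fns = list(fns)

    def __or__(self, other):
        # a | b returns a new Step holding both sides' functions in order
        return Step(*(self.fns + other.fns))

    def invoke(self, x):
        for fn in self.fns:        # run each step on the previous output
            x = fn(x)
        return x

prompt = Step(lambda d: f"Tell me about {d['topic']}")
fake_llm = Step(lambda p: f"ANSWER({p})")   # stand-in for a model call
parser = Step(lambda s: s.lower())

chain = prompt | fake_llm | parser
print(chain.invoke({"topic": "RAG systems"}))
# → "answer(tell me about rag systems)"
```

Because `|` returns another Step, chains compose with chains — the same property that lets you nest LCEL pipelines inside larger ones.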
Chapter 13 — Q&A Deep Dive

What is State in LangGraph?

State is the shared data object passed between nodes — the "memory" of your workflow at any moment.

Core Idea

Think of state as a snapshot of everything the graph knows right now as it moves through nodes. Every node reads from and writes to this state object.

python
from typing import TypedDict, List

class AgentState(TypedDict):
    messages: List[str]
    current_step: str
    retrieved_docs: List[str]
    final_answer: str

How it flows

START
  ↓
[Node A] → reads state, does work, updates state
  ↓
[Node B] → reads updated state, does more work, updates state
  ↓
[Node C] → reads state, produces final output
  ↓
END

No node talks directly to another — they only communicate through state.

Simple Example

python
class State(TypedDict):
    question: str
    retrieved_docs: str
    answer: str

# Each node receives state and returns ONLY what changed
def retrieve(state: State):
    docs = retriever.invoke(state["question"])
    return {"retrieved_docs": docs}      # only update what changed

def generate(state: State):
    answer = llm.invoke(state["retrieved_docs"])
    return {"answer": answer}

graph = StateGraph(State)
graph.add_node("retrieve", retrieve)
graph.add_node("generate", generate)
graph.add_edge(START, "retrieve")   # entry point
graph.add_edge("retrieve", "generate")

Why State matters — 3 key reasons

1. Nodes are decoupled

Nodes don't call each other — they just read/write state. This makes the graph modular and testable.

2. Conditional routing uses state

python
def should_continue(state: State) -> str:
    if state["answer"] == "":
        return "retry"       # go back and try again
    return "end"             # finish

graph.add_conditional_edges(
    "generate",
    should_continue,
    {"retry": "retrieve", "end": END}   # route name → node name
)

3. State enables memory across turns

In multi-turn agents, state persists the conversation history, tool results, and intermediate reasoning steps across the entire session.

State vs Simple Variables

               | Regular Python variables  | LangGraph State
Scope          | Local to a function       | Shared across all nodes
Persistence    | Lost after function ends  | Survives across node transitions
Routing        | Can't drive graph flow    | Can conditionally direct edges
Checkpointing  | Not built-in              | Can be saved/restored automatically
🧠 The Mental Model If LangGraph is a workflow — Nodes are the workers doing tasks, Edges are the routes between workers, and State is the shared document everyone reads and writes as the work progresses. It's the central nervous system of your agent graph.
Chapter 14 — Deep Dive

Annotated, TypedDict & Reducers

Breaking down every word in messages: Annotated[list, add_messages] — from first principles.

The full line, dissected

python
class AgentState(TypedDict):
    messages: Annotated[list, add_messages]

There are 4 concepts in this one line. Let's unpack each one.

1. TypedDict — a typed dictionary

TypedDict is a standard Python type from the typing module. It defines a dictionary with known keys and typed values. It's like a plain dict, but with type hints that tools (and LangGraph) can inspect.

python
from typing import TypedDict

# Without TypedDict — just a regular dict, no type safety
state = {"messages": [], "user": "Amir"}

# With TypedDict — keys and value types are declared
class AgentState(TypedDict):
    messages: list
    user: str

# It still behaves like a dict at runtime:
s: AgentState = {"messages": [], "user": "Amir"}
print(s["user"])   # → "Amir"
print(type(s))     # → <class 'dict'>  (it IS a dict!)
🔑 Key Point TypedDict is purely a type hint tool — it adds zero runtime overhead. At runtime it's just a plain Python dict. LangGraph uses the class definition to understand what fields exist in your state schema.

2. Annotated — attaching metadata to a type

Annotated is also from Python's typing module. It lets you attach extra metadata to a type hint — without changing the type itself.

python
from typing import Annotated

# Syntax: Annotated[actual_type, metadata1, metadata2, ...]
# The type is still "list" — Annotated doesn't change that

x: Annotated[list, "some metadata"]   # x is still a list
y: Annotated[int, "must be positive"]   # y is still an int

# Python itself ignores the metadata at runtime
# But FRAMEWORKS (like LangGraph, Pydantic) can READ it

Think of Annotated[list, add_messages] as saying: "This is a list, AND here's an instruction for LangGraph about what to do when this field gets updated."
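You can watch this metadata mechanism with nothing but the stdlib: typing.get_type_hints(..., include_extras=True) exposes the Annotated metadata that frameworks read. The reducer here (append_reducer) is a stand-in:

```python
# Reading the metadata off an Annotated field — the stdlib introspection
# that lets a framework discover reducers. append_reducer is a stand-in.
from typing import Annotated, TypedDict, get_args, get_type_hints

def append_reducer(existing: list, new: list) -> list:
    return existing + new

class AgentState(TypedDict):
    messages: Annotated[list, append_reducer]
    user_intent: str

hints = get_type_hints(AgentState, include_extras=True)

# The messages field carries its reducer as metadata:
args = get_args(hints["messages"])
print(args[0])   # → <class 'list'>   (the real type)
print(args[1])   # the reducer function (the metadata)

# Plain fields have no metadata:
print(get_args(hints["user_intent"]))  # → ()
```

This is why Annotated costs nothing at runtime: Python just stores the metadata tuple until something asks for it.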

3. The default problem — overwrite vs append

Without Annotated, when a node returns an update, LangGraph simply overwrites the field:

python
# State WITHOUT a reducer
class BadState(TypedDict):
    messages: list   # no Annotated — plain list

# Node A sets messages to [msg1]
# Node B returns {"messages": [msg2]}
# Result: messages = [msg2]  ← msg1 is GONE! 💥

# This is a disaster for chat history —
# every node would wipe the previous messages

4. add_messages — the reducer function

add_messages is a reducer — a function that LangGraph calls to decide how to merge a node's returned value into the existing state field.

python
from langgraph.graph.message import add_messages

# What add_messages actually does (simplified):
def add_messages(existing: list, new: list) -> list:
    return existing + new   # append, don't overwrite

# LangGraph calls it like this internally:
# new_state["messages"] = add_messages(old_state["messages"], node_return["messages"])
Without reducer (overwrite):
  state.messages = [HumanMsg("hi")]
    ↓ node returns {"messages": [AIMsg("hello")]}
  state.messages = [AIMsg("hello")]                  ← history lost! 💥

With add_messages reducer (append):
  state.messages = [HumanMsg("hi")]
    ↓ node returns {"messages": [AIMsg("hello")]}
  state.messages = [HumanMsg("hi"), AIMsg("hello")]  ← ✅ preserved
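A reducer-aware update looks like this in miniature — a sketch of the idea, with the reducer table hard-coded instead of read from Annotated metadata, and messages as plain strings:

```python
# Reducer-aware state update, in miniature: for each returned field, use the
# field's reducer if it has one, otherwise overwrite. Sketch only.

def add_messages_sketch(existing: list, new: list) -> list:
    return existing + new

REDUCERS = {"messages": add_messages_sketch}   # normally read from Annotated metadata

def apply_update(state: dict, node_return: dict) -> dict:
    new_state = dict(state)
    for key, value in node_return.items():
        reducer = REDUCERS.get(key)
        new_state[key] = reducer(state[key], value) if reducer else value
    return new_state

state = {"messages": ["HumanMsg(hi)"], "is_done": False}
state = apply_update(state, {"messages": ["AIMsg(hello)"], "is_done": True})

print(state["messages"])  # → ['HumanMsg(hi)', 'AIMsg(hello)']  (appended)
print(state["is_done"])   # → True                              (overwritten)
```

The same update merged one field through its reducer and overwrote the other — exactly the split between Annotated and plain fields.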

Putting it all together

python
from typing import Annotated, TypedDict
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    # "messages is a list, and use add_messages to update it"
    messages: Annotated[list, add_messages]
    
    # These use default overwrite behavior (no reducer needed)
    user_intent: str
    is_done: bool

# Node just returns the NEW message(s) — not the full history
def chat_node(state: AgentState):
    response = llm.invoke(state["messages"])
    return {"messages": [response]}  # ← only new msg; reducer appends it

# After 3 turns the state looks like:
# state["messages"] = [
#   HumanMessage("hi"),
#   AIMessage("hello!"),
#   HumanMessage("what's 2+2?"),
#   AIMessage("It's 4."),
#   HumanMessage("thanks"),
#   AIMessage("You're welcome!"),
# ]

add_messages also deduplicates

The real add_messages from LangGraph is smarter than a simple append — it also handles message ID deduplication. If a message with the same ID is returned, it replaces the old one instead of duplicating it. This is useful for tool result updates.

python
from langchain_core.messages import HumanMessage, AIMessage

existing = [HumanMessage(content="hi", id="msg-1")]
new      = [AIMessage(content="hello", id="msg-2")]

result = add_messages(existing, new)
# → [HumanMessage("hi"), AIMessage("hello")]  ← appended ✅

# If IDs match — it REPLACES instead of appending:
update = [HumanMessage(content="hey", id="msg-1")]
result2 = add_messages(existing, update)
# → [HumanMessage("hey")]  ← replaced, not duplicated ✅

Writing your own reducer

You're not limited to add_messages. Any function with signature (existing, new) → merged works as a reducer:

python
# Custom reducer: keep only the last 10 messages (sliding window)
def keep_last_10(existing: list, new: list) -> list:
    combined = existing + new
    return combined[-10:]   # trim to last 10

# Custom reducer: increment a counter
def increment(existing: int, new: int) -> int:
    return existing + new

class MyState(TypedDict):
    messages: Annotated[list, keep_last_10]   # sliding window
    tool_call_count: Annotated[int, increment]  # accumulating counter
✅ Rule of thumb Use Annotated[list, add_messages] for any field that accumulates over time (chat history, tool results, log entries). Use plain types (no Annotated) for fields that should simply be replaced (current intent, a flag, a score).